Tag

#reinforcement learning

64 articles

The apps, gadgets, and tools every reader needs

This article explains the complex challenge of AI alignment, exploring theoretical frameworks and practical approaches for ensuring artificial intelligence systems behave beneficially and in accordance with human values.

Jul 18

The father of reinforcement learning is leaving Carmack to build his own AI

Richard Sutton, the father of reinforcement learning and 2024 Turing Award winner, is leaving Keen Technologies to start a new AI company, Oak Lab.

Jul 14

Turing Award winner Rich Sutton founds Oak Lab to build AI agents that learn on their own

Turing Award winner Richard Sutton has founded Oak Lab to develop AI agents that learn autonomously from their environment, challenging current deep learning limitations.

Jul 13

Skyfall AI Releases MORPHEUS: A Persistent Enterprise Simulation Benchmark That Makes Continual Reinforcement Learning Necessary Under Structured Non-Stationarity

Skyfall AI introduces MORPHEUS, a persistent enterprise simulation benchmark that challenges current continual reinforcement learning methods by simulating non-stationary environments.

Jul 13

Siri AI is already changing how I use my iPhone

This explainer explores how iOS 27's enhanced Siri AI system represents a significant advancement in mobile artificial intelligence, combining transformer architectures, reinforcement learning, and edge computing techniques to create more sophisticated, adaptive voice assistants.

Jul 13

Prime Intellect Releases Verifiers v1: Composable Tasksets, Harnesses, and Runtimes for Agentic RL Training and Evaluations

Prime Intellect has released Verifiers v1, a modular platform for agentic reinforcement learning training and evaluations, featuring tasksets, harnesses, and runtimes.

Jul 12

Stanford Researchers Introduce TRACE: A Capability-Targeted Agentic Training System That Turns Recurrent Agent Failures Into Synthetic RL Environment

Learn how TRACE, a new AI system from Stanford, helps AI agents learn from their own mistakes to become more capable and efficient.

Jul 12

Guide to Loop Engineering: How ‘autoresearch’ and ‘Bilevel Autoresearch’ Turn AI Agents Into Autonomous Machine Learning (ML) Research Loops

This explainer explores loop engineering, a cutting-edge AI approach that enables autonomous machine learning research through iterative feedback loops. Learn how autoresearch and bilevel autoresearch allow AI agents to self-improve and discover new methodologies without human intervention.

Jul 12

AI agents win at Slay the Spire 2 after researchers replace growing chat logs with structured memory

This article explains how structured memory systems in AI agents can improve efficiency and performance in complex environments like Slay the Spire 2, by replacing traditional unstructured chat logs with modular memory layers.

Jul 11

Qwen’s Former Lead on What Hybrid Thinking Got Wrong — and Why He Now Backs Agents

This article explores the limitations of hybrid thinking in AI models and why researchers like Junyang Lin are now advocating for agentic thinking as a more robust and scalable approach.

Jul 4

Airwallex raises $320m at an $11bn valuation, betting on agentic finance

This explainer explores agentic finance, a cutting-edge field where AI agents autonomously manage financial tasks. Learn how reinforcement learning, deep learning, and transformer models enable these systems to make intelligent financial decisions.

Jun 25

DeepReinforce Releases Ornith-1.0: An Open-Source Coding Model Family That Learns Its Own RL Scaffolds

DeepReinforce has released Ornith-1.0, an open-source coding model that learns its own reinforcement learning scaffolding during training. The 397B parameter flagship model achieved a score of 82.4 on SWE-Bench Verified.

Jun 25